Overview

Dataset statistics

Number of variables: 26
Number of observations: 1000000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 9569
Duplicate rows (%): 1.0%
Total size in memory: 158.3 MiB
Average record size in memory: 166.0 B
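
The overview figures can be cross-checked against one another. The sketch below assumes that MiB means 2**20 bytes and that both the duplicate percentage and the average record size are plain ratios over the number of observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_rows = 1_000_000
duplicate_rows = 9_569
total_size_mib = 158.3

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_rows
duplicate_pct = 100 * duplicate_rows / n_rows

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```

Under these assumptions both values round to the figures shown in the overview.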

Variable types

Numeric: 10
Categorical: 16

Alerts

Duplicates: Dataset has 9569 (1.0%) duplicate rows
High correlation: NU_DESEMPENHO is highly overall correlated with NU_MEDIA_GERAL and 10 other fields
High correlation: NU_INFRAESTRUTURA is highly overall correlated with Q006
High correlation: NU_MEDIA_GERAL is highly overall correlated with NU_DESEMPENHO and 10 other fields
High correlation: NU_NOTA_CH is highly overall correlated with NU_DESEMPENHO and 10 other fields
High correlation: NU_NOTA_CN is highly overall correlated with NU_DESEMPENHO and 10 other fields
High correlation: NU_NOTA_LC is highly overall correlated with NU_DESEMPENHO and 10 other fields
High correlation: NU_NOTA_MT is highly overall correlated with NU_DESEMPENHO and 10 other fields
High correlation: NU_NOTA_REDACAO is highly overall correlated with NU_DESEMPENHO and 10 other fields
High correlation: Q006 is highly overall correlated with NU_INFRAESTRUTURA
High correlation: TP_ANO_CONCLUIU is highly overall correlated with TP_FAIXA_ETARIA
High correlation: TP_FAIXA_ETARIA is highly overall correlated with TP_ANO_CONCLUIU and 1 other field
High correlation: TP_PRESENCA_CH is highly overall correlated with NU_DESEMPENHO and 10 other fields
High correlation: TP_PRESENCA_CN is highly overall correlated with NU_DESEMPENHO and 10 other fields
High correlation: TP_PRESENCA_GERAL is highly overall correlated with NU_DESEMPENHO and 10 other fields
High correlation: TP_PRESENCA_LC is highly overall correlated with NU_DESEMPENHO and 10 other fields
High correlation: TP_PRESENCA_MT is highly overall correlated with NU_DESEMPENHO and 10 other fields
High correlation: TP_ST_CONCLUSAO is highly overall correlated with TP_FAIXA_ETARIA
Imbalance: TP_ESTADO_CIVIL is highly imbalanced (70.8%)
Imbalance: TP_DEPENDENCIA_ADM_ESC is highly imbalanced (53.6%)
Imbalance: Q025 is highly imbalanced (54.7%)
Zeros: TP_COR_RACA has 13385 (1.3%) zeros
Zeros: NU_NOTA_REDACAO has 29787 (3.0%) zeros
Zeros: TP_ANO_CONCLUIU has 569476 (56.9%) zeros
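
The report does not say how the imbalance percentages are defined. One reading that appears to reproduce them is one minus the Shannon entropy of the category frequencies divided by the maximum entropy for that number of categories. The sketch below is written under that assumption; `imbalance_score` is a hypothetical helper, not a function of the profiling tool, and the counts are copied from the TP_ESTADO_CIVIL and TP_DEPENDENCIA_ADM_ESC sections of this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of category counts.

    Assumed definition: 0 means perfectly balanced, 1 means a single
    category holds every observation.
    """
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0
    # Shannon entropy in bits; empty categories contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Category counts copied from the variable sections of this report.
estado_civil = [887575, 50949, 43691, 16595, 1190]
dependencia_adm = [757052, 170030, 58139, 12410, 2369]

print(f"TP_ESTADO_CIVIL: {100 * imbalance_score(estado_civil):.1f}%")
print(f"TP_DEPENDENCIA_ADM_ESC: {100 * imbalance_score(dependencia_adm):.1f}%")
```

Q025 is not included because its category counts are not listed in this part of the report.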

Reproduction

Analysis started: 2025-04-20 13:00:14.583380
Analysis finished: 2025-04-20 13:02:53.659626
Duration: 2 minutes and 39.08 seconds
Software version: ydata-profiling v4.16.1
Download configuration: config.json
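
The duration follows from the two timestamps. A minimal check, assuming both timestamps are in the same time zone (the format string is chosen to match the timestamps as printed above):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2025-04-20 13:00:14.583380", FMT)
finished = datetime.strptime("2025-04-20 13:02:53.659626", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```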

Variables

TP_FAIXA_ETARIA
Real number (ℝ)

High correlation 

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.097267
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 7
95-th percentile: 13
Maximum: 20
Range: 19
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.8733914
Coefficient of variation (CV): 0.75989573
Kurtosis: 0.33366307
Mean: 5.097267
Median Absolute Deviation (MAD): 1
Skewness: 1.1695409
Sum: 5097267
Variance: 15.003161
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=20)
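
Several of these statistics are derived from others, which gives a quick way to check that the fused label/value pairs were read correctly (for example that Q1 is 2 and Q3 is 7). The sketch below uses only the TP_FAIXA_ETARIA figures printed above and the standard definitions of the coefficient of variation, variance, range, and interquartile range.

```python
import math

# Values copied from the TP_FAIXA_ETARIA statistics above.
mean = 5.097267
std = 3.8733914
q1, q3 = 2, 7
minimum, maximum = 1, 20

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(cv, variance, iqr, value_range)
print(math.isclose(cv, 0.75989573, rel_tol=1e-6))
```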

Common values

Value | Count | Frequency (%)
3 | 229615 | 23.0%
2 | 191545 | 19.2%
4 | 109620 | 11.0%
1 | 88683 | 8.9%
5 | 67861 | 6.8%
11 | 62794 | 6.3%
6 | 46631 | 4.7%
7 | 35251 | 3.5%
12 | 33905 | 3.4%
8 | 28432 | 2.8%
Other values (10) | 105663 | 10.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 88683 | 8.9%
2 | 191545 | 19.2%
3 | 229615 | 23.0%
4 | 109620 | 11.0%
5 | 67861 | 6.8%
6 | 46631 | 4.7%
7 | 35251 | 3.5%
8 | 28432 | 2.8%
9 | 23165 | 2.3%
10 | 18672 | 1.9%

Maximum 10 values

Value | Count | Frequency (%)
20 | 244 | < 0.1%
19 | 550 | 0.1%
18 | 1382 | 0.1%
17 | 3441 | 0.3%
16 | 6316 | 0.6%
15 | 10393 | 1.0%
14 | 16912 | 1.7%
13 | 24588 | 2.5%
12 | 33905 | 3.4%
11 | 62794 | 6.3%

TP_SEXO
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 976.8 KiB

F: 612211
M: 387789

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: F
3rd row: M
4th row: M
5th row: F

Common Values

Value | Count | Frequency (%)
F | 612211 | 61.2%
M | 387789 | 38.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
f | 612211 | 61.2%
m | 387789 | 38.8%

Most occurring characters

Value | Count | Frequency (%)
F | 612211 | 61.2%
M | 387789 | 38.8%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

TP_ESTADO_CIVIL
Categorical

Imbalance 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

1: 887575
2: 50949
0: 43691
3: 16595
4: 1190

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 887575 | 88.8%
2 | 50949 | 5.1%
0 | 43691 | 4.4%
3 | 16595 | 1.7%
4 | 1190 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 887575 | 88.8%
2 | 50949 | 5.1%
0 | 43691 | 4.4%
3 | 16595 | 1.7%
4 | 1190 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
1 | 887575 | 88.8%
2 | 50949 | 5.1%
0 | 43691 | 4.4%
3 | 16595 | 1.7%
4 | 1190 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

TP_COR_RACA
Real number (ℝ)

Zeros 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.058675
Minimum: 0
Maximum: 5
Zeros: 13385
Zeros (%): 1.3%
Negative: 0
Negative (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 2
Q3: 3
95-th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.0035982
Coefficient of variation (CV): 0.48749713
Kurtosis: -1.2379797
Mean: 2.058675
Median Absolute Deviation (MAD): 1
Skewness: 0.049803775
Sum: 2058675
Variance: 1.0072093
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Common values

Value | Count | Frequency (%)
3 | 434023 | 43.4%
1 | 400501 | 40.1%
2 | 129253 | 12.9%
4 | 16591 | 1.7%
0 | 13385 | 1.3%
5 | 6247 | 0.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 13385 | 1.3%
1 | 400501 | 40.1%
2 | 129253 | 12.9%
3 | 434023 | 43.4%
4 | 16591 | 1.7%
5 | 6247 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
5 | 6247 | 0.6%
4 | 16591 | 1.7%
3 | 434023 | 43.4%
2 | 129253 | 12.9%
1 | 400501 | 40.1%
0 | 13385 | 1.3%
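
Because TP_COR_RACA has only six distinct values, its sum, mean, and variance can be recomputed from the frequency table alone. The sketch below does that; it assumes the reported variance uses the sample (n - 1) denominator, which is not stated in the report but matches the printed figure.

```python
import math

# Value -> count, copied from the TP_COR_RACA frequency table above.
counts = {0: 13385, 1: 400501, 2: 129253, 3: 434023, 4: 16591, 5: 6247}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Assumption: sample variance with an (n - 1) denominator.
squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)

print(n, total, mean, variance, std)
```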

TP_DEPENDENCIA_ADM_ESC
Categorical

Imbalance 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

-1.0: 757052
2.0: 170030
4.0: 58139
1.0: 12410
3.0: 2369

Length

Max length: 4
Median length: 4
Mean length: 3.757052
Min length: 3

Characters and Unicode

Total characters: 3757052
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: -1.0
3rd row: -1.0
4th row: -1.0
5th row: 2.0

Common Values

Value | Count | Frequency (%)
-1.0 | 757052 | 75.7%
2.0 | 170030 | 17.0%
4.0 | 58139 | 5.8%
1.0 | 12410 | 1.2%
3.0 | 2369 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1.0 | 769462 | 76.9%
2.0 | 170030 | 17.0%
4.0 | 58139 | 5.8%
3.0 | 2369 | 0.2%

Most occurring characters

Value | Count | Frequency (%)
. | 1000000 | 26.6%
0 | 1000000 | 26.6%
1 | 769462 | 20.5%
- | 757052 | 20.2%
2 | 170030 | 4.5%
4 | 58139 | 1.5%
3 | 2369 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 3757052 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 3757052 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 3757052 | 100.0%
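
The length and character statistics for TP_DEPENDENCIA_ADM_ESC follow from the labels being stored as float-like strings such as "-1.0". The sketch below recomputes the total characters and mean length from the category counts. It also shows that the 769462 occurrences of the character "1" are the "-1.0" and "1.0" rows combined, and that the labels convert cleanly to integer codes. The meaning of the -1 code is not stated in this report, so the conversion is only a suggestion.

```python
# Label -> count, copied from the Common Values table above.
counts = {"-1.0": 757052, "2.0": 170030, "4.0": 58139, "1.0": 12410, "3.0": 2369}

n = sum(counts.values())
total_chars = sum(len(label) * count for label, count in counts.items())
mean_length = total_chars / n

# The character "1" occurs once in both "-1.0" and "1.0".
ones = counts["-1.0"] + counts["1.0"]

# Hypothetical clean-up: store the codes as integers instead of strings.
as_int = {int(float(label)): count for label, count in counts.items()}

print(total_chars, mean_length, ones, sorted(as_int))
```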

TP_ST_CONCLUSAO
Categorical

High correlation 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

1: 482464
2: 355120
3: 157951
4: 4465

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 3
3rd row: 1
4th row: 1
5th row: 2

Common Values

Value | Count | Frequency (%)
1 | 482464 | 48.2%
2 | 355120 | 35.5%
3 | 157951 | 15.8%
4 | 4465 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 482464 | 48.2%
2 | 355120 | 35.5%
3 | 157951 | 15.8%
4 | 4465 | 0.4%

Most occurring characters

Value | Count | Frequency (%)
1 | 482464 | 48.2%
2 | 355120 | 35.5%
3 | 157951 | 15.8%
4 | 4465 | 0.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

SG_UF_PROVA
Categorical

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 977.9 KiB

SP: 150003
MG: 91151
BA: 82841
RJ: 72249
CE: 61938
Other values (22): 541818

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 2000000
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: RJ
2nd row: SP
3rd row: MA
4th row: PE
5th row: PR

Common Values

Value | Count | Frequency (%)
SP | 150003 | 15.0%
MG | 91151 | 9.1%
BA | 82841 | 8.3%
RJ | 72249 | 7.2%
CE | 61938 | 6.2%
PA | 57908 | 5.8%
PE | 55541 | 5.6%
PR | 42307 | 4.2%
MA | 42094 | 4.2%
RS | 40299 | 4.0%
Other values (17) | 303669 | 30.4%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
sp | 150003 | 15.0%
mg | 91151 | 9.1%
ba | 82841 | 8.3%
rj | 72249 | 7.2%
ce | 61938 | 6.2%
pa | 57908 | 5.8%
pe | 55541 | 5.6%
pr | 42307 | 4.2%
ma | 42094 | 4.2%
rs | 40299 | 4.0%
Other values (17) | 303669 | 30.4%

Most occurring characters

Value | Count | Frequency (%)
P | 370060 | 18.5%
S | 260654 | 13.0%
A | 241040 | 12.1%
R | 194480 | 9.7%
M | 185562 | 9.3%
E | 152588 | 7.6%
G | 129050 | 6.5%
B | 114415 | 5.7%
C | 91310 | 4.6%
J | 72249 | 3.6%
Other values (7) | 188592 | 9.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2000000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2000000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2000000 | 100.0%

TP_PRESENCA_CN
Categorical

High correlation 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

1: 684174
0: 315276
2: 550

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 684174 | 68.4%
0 | 315276 | 31.5%
2 | 550 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 684174 | 68.4%
0 | 315276 | 31.5%
2 | 550 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
1 | 684174 | 68.4%
0 | 315276 | 31.5%
2 | 550 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

TP_PRESENCA_CH
Categorical

High correlation 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

1: 717365
0: 281489
2: 1146

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 717365 | 71.7%
0 | 281489 | 28.1%
2 | 1146 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 717365 | 71.7%
0 | 281489 | 28.1%
2 | 1146 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
1 | 717365 | 71.7%
0 | 281489 | 28.1%
2 | 1146 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

TP_PRESENCA_LC
Categorical

High correlation 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

1: 717365
0: 281489
2: 1146

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 717365 | 71.7%
0 | 281489 | 28.1%
2 | 1146 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 717365 | 71.7%
0 | 281489 | 28.1%
2 | 1146 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
1 | 717365 | 71.7%
0 | 281489 | 28.1%
2 | 1146 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

TP_PRESENCA_MT
Categorical

High correlation 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

1: 684174
0: 315276
2: 550

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 684174 | 68.4%
0 | 315276 | 31.5%
2 | 550 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 684174 | 68.4%
0 | 315276 | 31.5%
2 | 550 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
1 | 684174 | 68.4%
0 | 315276 | 31.5%
2 | 550 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

NU_NOTA_CN
Real number (ℝ)

High correlation 

Distinct: 1402
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 338.95541
Minimum: -1
Maximum: 868.5
Zeros: 4111
Zeros (%): 0.4%
Negative: 315826
Negative (%): 31.6%
Memory size: 7.6 MiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: -1
Median: 445
Q3: 524
95-th percentile: 615.5
Maximum: 868.5
Range: 869.5
Interquartile range (IQR): 525

Descriptive statistics

Standard deviation: 242.11099
Coefficient of variation (CV): 0.71428568
Kurtosis: -1.3920365
Mean: 338.95541
Median Absolute Deviation (MAD): 106
Skewness: -0.53767043
Sum: 3.3895541 × 10^8
Variance: 58617.733
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
-1 | 315826 | 31.6%
0 | 4111 | 0.4%
529.5 | 1568 | 0.2%
523.5 | 1559 | 0.2%
521 | 1558 | 0.2%
531 | 1557 | 0.2%
513 | 1553 | 0.2%
516.5 | 1547 | 0.2%
522.5 | 1539 | 0.2%
528 | 1539 | 0.2%
Other values (1392) | 667643 | 66.8%

Minimum 10 values

Value | Count | Frequency (%)
-1 | 315826 | 31.6%
0 | 4111 | 0.4%
320.25 | 1 | < 0.1%
321 | 1 | < 0.1%
322.5 | 1 | < 0.1%
323.25 | 38 | < 0.1%
323.5 | 123 | < 0.1%
323.75 | 74 | < 0.1%
324 | 97 | < 0.1%
324.25 | 45 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
868.5 | 1 | < 0.1%
856.5 | 1 | < 0.1%
854.5 | 3 | < 0.1%
854 | 8 | < 0.1%
844.5 | 4 | < 0.1%
843.5 | 5 | < 0.1%
843 | 2 | < 0.1%
842 | 2 | < 0.1%
840.5 | 1 | < 0.1%
839.5 | 1 | < 0.1%
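
The value -1 accounts for 315826 rows of NU_NOTA_CN, which equals the number of rows where TP_PRESENCA_CN is 0 or 2 (315276 + 550). This suggests, as an assumption not confirmed anywhere in the report, that -1 is a placeholder for "no score recorded". If so, the mean, quartiles, and standard deviation above mix placeholders with real scores. The sketch below checks the count relation and shows, on a small hypothetical list of scores, how excluding the placeholder changes the summary.

```python
from statistics import mean, median

SENTINEL = -1  # assumption: placeholder meaning "no score recorded"

# Counts copied from the NU_NOTA_CN and TP_PRESENCA_CN sections.
negative_scores = 315826
presence_counts = {1: 684174, 0: 315276, 2: 550}
sentinel_matches_presence = (
    negative_scores == presence_counts[0] + presence_counts[2]
)

# Hypothetical toy scores for illustration only, not taken from the dataset.
scores = [-1, -1, 445.0, 524.0, 615.5, 0.0]
valid = [s for s in scores if s != SENTINEL]

print("sentinel count matches presence codes:", sentinel_matches_presence)
print("with sentinel:   ", mean(scores), median(scores))
print("without sentinel:", mean(valid), median(valid))
```

A score of 0 is kept as a valid value here, since the report counts zeros separately from negatives.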

NU_NOTA_CH
Real number (ℝ)

High correlation 

Distinct: 1432
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 375.11279
Minimum: -1
Maximum: 823
Zeros: 1447
Zeros (%): 0.1%
Negative: 282635
Negative (%): 28.3%
Memory size: 7.6 MiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: -1
Median: 483.75
Q3: 562.5
95-th percentile: 644
Maximum: 823
Range: 824
Interquartile range (IQR): 563.5

Descriptive statistics

Standard deviation: 247.71502
Coefficient of variation (CV): 0.66037477
Kurtosis: -1.1996348
Mean: 375.11279
Median Absolute Deviation (MAD): 104
Skewness: -0.69057496
Sum: 3.7511279 × 10^8
Variance: 61362.732
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
-1 | 282635 | 28.3%
538 | 1775 | 0.2%
543.5 | 1763 | 0.2%
540.5 | 1758 | 0.2%
535 | 1758 | 0.2%
538.5 | 1756 | 0.2%
542.5 | 1752 | 0.2%
541 | 1750 | 0.2%
536.5 | 1744 | 0.2%
547 | 1742 | 0.2%
Other values (1422) | 701567 | 70.2%

Minimum 10 values

Value | Count | Frequency (%)
-1 | 282635 | 28.3%
0 | 1447 | 0.1%
290 | 1 | < 0.1%
293.5 | 136 | < 0.1%
293.75 | 17 | < 0.1%
294 | 23 | < 0.1%
294.25 | 16 | < 0.1%
294.5 | 39 | < 0.1%
294.75 | 12 | < 0.1%
295 | 30 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
823 | 16 | < 0.1%
805 | 8 | < 0.1%
804.5 | 18 | < 0.1%
804 | 8 | < 0.1%
800.5 | 18 | < 0.1%
799.5 | 20 | < 0.1%
798.5 | 3 | < 0.1%
794.5 | 9 | < 0.1%
794 | 1 | < 0.1%
793.5 | 3 | < 0.1%

NU_NOTA_LC
Real number (ℝ)

High correlation 

Distinct: 1418
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 371.38743
Minimum: -1
Maximum: 821
Zeros: 554
Zeros (%): 0.1%
Negative: 282635
Negative (%): 28.3%
Memory size: 7.6 MiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: -1
Median: 484
Q3: 550.5
95-th percentile: 622
Maximum: 821
Range: 822
Interquartile range (IQR): 551.5

Descriptive statistics

Standard deviation: 242.31092
Coefficient of variation (CV): 0.65244781
Kurtosis: -1.1718687
Mean: 371.38743
Median Absolute Deviation (MAD): 88
Skewness: -0.75203672
Sum: 3.7138743 × 10^8
Variance: 58714.58
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
-1 | 282635 | 28.3%
527 | 2067 | 0.2%
537 | 2044 | 0.2%
524 | 2041 | 0.2%
538.5 | 2033 | 0.2%
526 | 2029 | 0.2%
538 | 2018 | 0.2%
535.5 | 2017 | 0.2%
517 | 2011 | 0.2%
514.5 | 2010 | 0.2%
Other values (1408) | 699095 | 69.9%

Minimum 10 values

Value | Count | Frequency (%)
-1 | 282635 | 28.3%
0 | 554 | 0.1%
287 | 1 | < 0.1%
287.25 | 45 | < 0.1%
287.5 | 4 | < 0.1%
287.75 | 14 | < 0.1%
288 | 8 | < 0.1%
288.25 | 10 | < 0.1%
288.5 | 36 | < 0.1%
288.75 | 15 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
821 | 1 | < 0.1%
803 | 1 | < 0.1%
801 | 1 | < 0.1%
797.5 | 1 | < 0.1%
795.5 | 1 | < 0.1%
788.5 | 2 | < 0.1%
783.5 | 1 | < 0.1%
781.5 | 3 | < 0.1%
781 | 1 | < 0.1%
780.5 | 1 | < 0.1%

NU_NOTA_MT
Real number (ℝ)

High correlation 

Distinct: 1608
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 364.94704
Minimum: -1
Maximum: 958.5
Zeros: 4130
Zeros (%): 0.4%
Negative: 315826
Negative (%): 31.6%
Memory size: 7.6 MiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: -1
Median: 438.25
Q3: 579
95-th percentile: 729.5
Maximum: 958.5
Range: 959.5
Interquartile range (IQR): 580

Descriptive statistics

Standard deviation: 271.35744
Coefficient of variation (CV): 0.74355291
Kurtosis: -1.3437577
Mean: 364.94704
Median Absolute Deviation (MAD): 185.75
Skewness: -0.3100719
Sum: 3.6494704 × 10^8
Variance: 73634.859
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
-1 | 315826 | 31.6%
0 | 4130 | 0.4%
515.5 | 944 | 0.1%
517 | 928 | 0.1%
519 | 925 | 0.1%
516.5 | 917 | 0.1%
528 | 916 | 0.1%
518 | 913 | 0.1%
520.5 | 909 | 0.1%
520 | 908 | 0.1%
Other values (1598) | 672684 | 67.3%

Minimum 10 values

Value | Count | Frequency (%)
-1 | 315826 | 31.6%
0 | 4130 | 0.4%
319.75 | 1 | < 0.1%
321 | 1 | < 0.1%
322.75 | 2 | < 0.1%
323.25 | 1 | < 0.1%
324 | 1 | < 0.1%
324.5 | 1 | < 0.1%
325 | 2 | < 0.1%
325.5 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
958.5 | 118 | < 0.1%
948 | 7 | < 0.1%
946.5 | 23 | < 0.1%
945.5 | 8 | < 0.1%
945 | 20 | < 0.1%
943.5 | 84 | < 0.1%
941 | 10 | < 0.1%
940 | 2 | < 0.1%
939 | 16 | < 0.1%
938.5 | 1 | < 0.1%

NU_NOTA_REDACAO
Real number (ℝ)

High correlation  Zeros 

Distinct: 51
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 442.82207
Minimum: -1
Maximum: 1000
Zeros: 29787
Zeros (%): 3.0%
Negative: 282635
Negative (%): 28.3%
Memory size: 7.6 MiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: -1
Median: 540
Q3: 700
95-th percentile: 920
Maximum: 1000
Range: 1001
Interquartile range (IQR): 701

Descriptive statistics

Standard deviation: 332.50011
Coefficient of variation (CV): 0.75086617
Kurtosis: -1.3830158
Mean: 442.82207
Median Absolute Deviation (MAD): 240
Skewness: -0.24522614
Sum: 4.4282206 × 10^8
Variance: 110556.32
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
-1 | 282635 | 28.3%
560 | 38855 | 3.9%
600 | 38715 | 3.9%
580 | 32057 | 3.2%
640 | 31463 | 3.1%
520 | 30090 | 3.0%
0 | 29787 | 3.0%
540 | 26941 | 2.7%
620 | 26794 | 2.7%
680 | 26173 | 2.6%
Other values (41) | 436490 | 43.6%

Minimum 10 values

Value | Count | Frequency (%)
-1 | 282635 | 28.3%
0 | 29787 | 3.0%
40 | 27 | < 0.1%
60 | 19 | < 0.1%
80 | 34 | < 0.1%
100 | 21 | < 0.1%
120 | 52 | < 0.1%
140 | 49 | < 0.1%
160 | 176 | < 0.1%
180 | 212 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1000 | 13 | < 0.1%
980 | 3195 | 0.3%
960 | 11069 | 1.1%
940 | 17361 | 1.7%
920 | 22420 | 2.2%
900 | 18706 | 1.9%
880 | 22701 | 2.3%
860 | 16514 | 1.7%
840 | 21169 | 2.1%
820 | 15783 | 1.6%

TP_PRESENCA_GERAL
Categorical

High correlation 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

1: 680571
0: 279032
2: 36794
3: 3603

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 680571 | 68.1%
0 | 279032 | 27.9%
2 | 36794 | 3.7%
3 | 3603 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 680571 | 68.1%
0 | 279032 | 27.9%
2 | 36794 | 3.7%
3 | 3603 | 0.4%

Most occurring characters

Value | Count | Frequency (%)
1 | 680571 | 68.1%
0 | 279032 | 27.9%
2 | 36794 | 3.7%
3 | 3603 | 0.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

TP_ANO_CONCLUIU
Real number (ℝ)

High correlation  Zeros 

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.445723
Minimum: 0
Maximum: 17
Zeros: 569476
Zeros (%): 56.9%
Negative: 0
Negative (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 3
95-th percentile: 15
Maximum: 17
Range: 17
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 4.446031
Coefficient of variation (CV): 1.81788
Kurtosis: 3.6962453
Mean: 2.445723
Median Absolute Deviation (MAD): 0
Skewness: 2.1454808
Sum: 2445723
Variance: 19.767192
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=18)

Common values

Value | Count | Frequency (%)
0 | 569476 | 56.9%
1 | 106613 | 10.7%
2 | 67116 | 6.7%
17 | 42343 | 4.2%
3 | 38173 | 3.8%
4 | 34853 | 3.5%
5 | 26736 | 2.7%
6 | 21764 | 2.2%
7 | 16676 | 1.7%
8 | 13963 | 1.4%
Other values (8) | 62287 | 6.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 569476 | 56.9%
1 | 106613 | 10.7%
2 | 67116 | 6.7%
3 | 38173 | 3.8%
4 | 34853 | 3.5%
5 | 26736 | 2.7%
6 | 21764 | 2.2%
7 | 16676 | 1.7%
8 | 13963 | 1.4%
9 | 11897 | 1.2%

Maximum 10 values

Value | Count | Frequency (%)
17 | 42343 | 4.2%
16 | 5164 | 0.5%
15 | 5372 | 0.5%
14 | 6256 | 0.6%
13 | 6946 | 0.7%
12 | 7298 | 0.7%
11 | 9141 | 0.9%
10 | 10213 | 1.0%
9 | 11897 | 1.2%
8 | 13963 | 1.4%

NU_DESEMPENHO
Categorical

High correlation 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB

2: 454653
3: 361569
1: 183778

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 454653 | 45.5%
3 | 361569 | 36.2%
1 | 183778 | 18.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2 | 454653 | 45.5%
3 | 361569 | 36.2%
1 | 183778 | 18.4%

Most occurring characters

Value | Count | Frequency (%)
2 | 454653 | 45.5%
3 | 361569 | 36.2%
1 | 183778 | 18.4%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1000000 | 100.0%

NU_INFRAESTRUTURA
Categorical

High correlation 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.6 MiB
Most frequent values: 1 (366709), 2 (343423), 3 (289868)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 1
3rd row: 2
4th row: 2
5th row: 3

Common Values

Value             Count   Frequency (%)
1                 366709  36.7%
2                 343423  34.3%
3                 289868  29.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Every value is a single character, so the character counts match the Common Values table. The per-category, per-script, and per-block character tables repeat the same counts.

Most occurring categories, scripts, and blocks

Value             Count    Frequency (%)
(unknown)         1000000  100.0%

NU_MEDIA_GERAL
Real number (ℝ)

High correlation 

Distinct: 2939
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 378.64485
Minimum: -1
Maximum: 848
Zeros: 4
Zeros (%): < 0.1%
Negative: 279349
Negative (%): 27.9%
Memory size: 7.6 MiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: -1
Median: 480
Q3: 572.5
95-th percentile: 681.5
Maximum: 848
Range: 849
Interquartile range (IQR): 573.5

Descriptive statistics

Standard deviation: 254.75242
Coefficient of variation (CV): 0.67280043
Kurtosis: -1.2429281
Mean: 378.64485
Median Absolute Deviation (MAD): 127
Skewness: -0.57788713
Sum: 3.7864485 × 10^8
Variance: 64898.795
Monotonicity: Not monotonic
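
The minimum, the 5-th percentile, and Q1 are all -1, and 27.9% of the values are negative. That pattern suggests -1 is a placeholder rather than a real score, which would pull the mean down and widen the IQR. The sketch below illustrates the effect on a small made-up sample. Both the sentinel interpretation and the sample scores are assumptions for illustration, not values taken from the dataset.

```python
import statistics

SENTINEL = -1  # assumed placeholder for "no score"; not confirmed by the report

# Hypothetical scores, for illustration only.
scores = [-1, -1, -1, 480.0, 522.0, 540.0, 572.5, 610.0, 681.5, 848.0]


def summarize(values):
    """Return (median, IQR) using inclusive quartiles."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q3 - q1


with_sentinel = summarize(scores)
without_sentinel = summarize([s for s in scores if s != SENTINEL])

print("with sentinel   :", with_sentinel)
print("without sentinel:", without_sentinel)
```

If the placeholder reading is correct, statistics for this variable are easier to interpret when computed on the filtered values only.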
[Figure: histogram with fixed-size bins (bins=50)]
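
With 50 fixed-size bins over the reported range (minimum -1, maximum 848), each bin would span 849 / 50 = 16.98 score points. The sketch below assumes equal-width bins between the minimum and maximum, with the last bin closed on the right. The helper bin_index is hypothetical and only illustrates the arithmetic.

```python
def bin_index(value, lo, hi, bins):
    """Index of the equal-width bin that contains value.

    Bins are half-open [left, right) except the last one, which also
    includes the upper bound hi.
    """
    if not lo <= value <= hi:
        raise ValueError("value outside histogram range")
    width = (hi - lo) / bins
    return min(int((value - lo) / width), bins - 1)


lo, hi, bins = -1.0, 848.0, 50  # range and bin count from the report
width = (hi - lo) / bins

print(width)                         # bin width in score points
print(bin_index(-1.0, lo, hi, bins))   # the -1 values fall in the first bin
print(bin_index(480.0, lo, hi, bins))  # bin holding the reported median
```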
Common Values

Value                Count   Frequency (%)
-1                   279032  27.9%
522                  1589    0.2%
540                  1586    0.2%
536                  1579    0.2%
547                  1570    0.2%
533                  1561    0.2%
535                  1559    0.2%
541                  1557    0.2%
513                  1551    0.2%
517                  1544    0.2%
Other values (2929)  706872  70.7%

Minimum 10 values

Value                Count   Frequency (%)
-1                   279032  27.9%
-0.6                 72      < 0.1%
-0.4                 245     < 0.1%
0                    4       < 0.1%
39.6                 1       < 0.1%
47.6                 1       < 0.1%
51.6                 2       < 0.1%
55.6                 2       < 0.1%
57.06                2       < 0.1%
57.2                 1       < 0.1%

Maximum 10 values

Value                Count   Frequency (%)
848                  2       < 0.1%
847                  1       < 0.1%
844.5                1       < 0.1%
843.5                1       < 0.1%
842                  1       < 0.1%
840.5                2       < 0.1%
839.5                1       < 0.1%
837                  1       < 0.1%
836.5                1       < 0.1%
832                  2       < 0.1%

Q001
Categorical

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 977.0 KiB
Most frequent values: E (283473), B (178231), C (129969), D (111234), H (102520), Other values (3) (194573)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C
2nd row: F
3rd row: D
4th row: B
5th row: B

Common Values

Value             Count   Frequency (%)
E                 283473  28.3%
B                 178231  17.8%
C                 129969  13.0%
D                 111234  11.1%
H                 102520  10.3%
F                 84967   8.5%
G                 65278   6.5%
A                 44328   4.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Every value is a single character, so the character counts match the Common Values table. The per-category, per-script, and per-block character tables repeat the same counts.

Most occurring categories, scripts, and blocks

Value             Count    Frequency (%)
(unknown)         1000000  100.0%

Q002
Categorical

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 977.0 KiB
Most frequent values: E (350191), B (128472), D (120093), F (116429), G (112447), Other values (3) (172368)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C
2nd row: F
3rd row: C
4th row: B
5th row: B

Common Values

Value             Count   Frequency (%)
E                 350191  35.0%
B                 128472  12.8%
D                 120093  12.0%
F                 116429  11.6%
G                 112447  11.2%
C                 110653  11.1%
H                 33453   3.3%
A                 28262   2.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Every value is a single character, so the character counts match the Common Values table. The per-category, per-script, and per-block character tables repeat the same counts.

Most occurring categories, scripts, and blocks

Value             Count    Frequency (%)
(unknown)         1000000  100.0%

Q005
Real number (ℝ)

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.68914
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 4
Q3: 4
95-th percentile: 6
Maximum: 20
Range: 19
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.4090862
Coefficient of variation (CV): 0.38195521
Kurtosis: 5.8725695
Mean: 3.68914
Median Absolute Deviation (MAD): 1
Skewness: 1.1397313
Sum: 3689140
Variance: 1.985524
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=20)]

Common Values

Value              Count   Frequency (%)
4                  317410  31.7%
3                  275886  27.6%
2                  144434  14.4%
5                  144060  14.4%
6                  49453   4.9%
1                  37149   3.7%
7                  18083   1.8%
8                  7479    0.7%
9                  2807    0.3%
10                 1720    0.2%
Other values (10)  1519    0.2%

Minimum 10 values

Value              Count   Frequency (%)
1                  37149   3.7%
2                  144434  14.4%
3                  275886  27.6%
4                  317410  31.7%
5                  144060  14.4%
6                  49453   4.9%
7                  18083   1.8%
8                  7479    0.7%
9                  2807    0.3%
10                 1720    0.2%

Maximum 10 values

Value              Count   Frequency (%)
20                 135     < 0.1%
19                 13      < 0.1%
18                 24      < 0.1%
17                 25      < 0.1%
16                 35      < 0.1%
15                 101     < 0.1%
14                 101     < 0.1%
13                 165     < 0.1%
12                 373     < 0.1%
11                 547     0.1%

Q006
Categorical

High correlation 

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 977.4 KiB
Most frequent values: B (315588), C (166128), D (111331), E (74994), A (68034), Other values (12) (263925)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B
2nd row: D
3rd row: B
4th row: C
5th row: B

Common Values

Value             Count   Frequency (%)
B                 315588  31.6%
C                 166128  16.6%
D                 111331  11.1%
E                 74994   7.5%
A                 68034   6.8%
G                 66038   6.6%
F                 43775   4.4%
H                 35309   3.5%
I                 21953   2.2%
J                 19391   1.9%
Other values (7)  77459   7.7%

Length

[Figure: histogram of lengths of the category]

Most occurring characters

Every value is a single character, so the character counts match the Common Values table. The per-category, per-script, and per-block character tables repeat the same counts.

Most occurring categories, scripts, and blocks

Value             Count    Frequency (%)
(unknown)         1000000  100.0%

Q025
Categorical

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 976.8 KiB
Most frequent values: B (904861), A (95139)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1000000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: B
2nd row: B
3rd row: B
4th row: B
5th row: B

Common Values

Value             Count   Frequency (%)
B                 904861  90.5%
A                 95139   9.5%
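
Q025 carries an Imbalance alert because one category holds about 90% of the rows. One common way to quantify this is one minus the normalized Shannon entropy of the category frequencies, where 0 means perfectly balanced and 1 means a single category. Whether the report uses exactly this formula is an assumption here, and imbalance_score is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """One minus normalized Shannon entropy of the category counts.

    Returns 0.0 for a perfectly balanced variable and approaches 1.0
    as a single category dominates.
    """
    if len(counts) < 2:
        return 0.0  # a single category has no balance to measure
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


q025_counts = [904861, 95139]  # counts for B and A from the table above
score = imbalance_score(q025_counts)
print(f"{score:.3f}")
```

Under this definition a 50/50 split scores 0, so a score above roughly one half already indicates a strongly dominant category.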

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Most occurring characters

Every value is a single character, so the character counts match the Common Values table. The per-category, per-script, and per-block character tables repeat the same counts.

Most occurring categories, scripts, and blocks

Value             Count    Frequency (%)
(unknown)         1000000  100.0%

Interactions

[Figures: 100 pairwise interaction plots; images not reproduced]

Correlations

[Figure: correlation heatmap]

Rows and columns follow the same variable order; values are separated by spaces.

Variable NU_DESEMPENHO NU_INFRAESTRUTURA NU_MEDIA_GERAL NU_NOTA_CH NU_NOTA_CN NU_NOTA_LC NU_NOTA_MT NU_NOTA_REDACAO Q001 Q002 Q005 Q006 Q025 SG_UF_PROVA TP_ANO_CONCLUIU TP_COR_RACA TP_DEPENDENCIA_ADM_ESC TP_ESTADO_CIVIL TP_FAIXA_ETARIA TP_PRESENCA_CH TP_PRESENCA_CN TP_PRESENCA_GERAL TP_PRESENCA_LC TP_PRESENCA_MT TP_SEXO TP_ST_CONCLUSAO
NU_DESEMPENHO 1.000 0.219 0.953 0.756 0.781 0.741 0.819 0.776 0.242 0.240 0.085 0.277 0.126 0.101 0.122 0.147 0.197 0.088 0.192 0.590 0.635 0.641 0.590 0.635 0.036 0.130
NU_INFRAESTRUTURA 0.219 1.000 0.241 0.212 0.216 0.215 0.251 0.191 0.319 0.303 0.170 0.528 0.423 0.278 0.121 0.219 0.172 0.059 0.198 0.114 0.113 0.116 0.114 0.113 0.099 0.150
NU_MEDIA_GERAL 0.953 0.241 1.000 0.918 0.913 0.910 0.936 0.924 0.143 0.140 0.017 0.149 0.149 0.062 -0.143 -0.168 0.151 0.070 -0.282 0.702 0.681 0.750 0.702 0.681 0.050 0.146
NU_NOTA_CH 0.756 0.212 0.918 1.000 0.837 0.918 0.837 0.816 0.121 0.119 0.001 0.145 0.135 0.074 -0.107 -0.162 0.123 0.063 -0.228 0.705 0.642 0.579 0.705 0.642 0.067 0.147
NU_NOTA_CN 0.781 0.216 0.913 0.837 1.000 0.827 0.882 0.774 0.126 0.122 0.007 0.154 0.130 0.069 -0.109 -0.151 0.131 0.060 -0.224 0.635 0.700 0.572 0.635 0.700 0.119 0.133
NU_NOTA_LC 0.741 0.215 0.910 0.918 0.827 1.000 0.829 0.814 0.122 0.121 0.003 0.144 0.142 0.078 -0.122 -0.167 0.117 0.065 -0.249 0.706 0.643 0.580 0.706 0.643 0.044 0.143
NU_NOTA_MT 0.819 0.251 0.936 0.837 0.882 0.829 1.000 0.798 0.144 0.139 0.022 0.177 0.150 0.080 -0.148 -0.163 0.143 0.065 -0.280 0.635 0.700 0.572 0.635 0.700 0.142 0.128
NU_NOTA_REDACAO 0.776 0.191 0.924 0.816 0.774 0.814 0.798 1.000 0.118 0.122 0.037 0.112 0.121 0.043 -0.178 -0.140 0.134 0.077 -0.317 0.658 0.614 0.542 0.658 0.614 0.085 0.129
Q001 0.242 0.319 0.143 0.121 0.126 0.122 0.144 0.118 1.000 0.341 0.049 0.238 0.194 0.088 0.078 0.115 0.137 0.080 0.126 0.123 0.123 0.103 0.123 0.123 0.069 0.134
Q002 0.240 0.303 0.140 0.119 0.122 0.121 0.139 0.122 0.341 1.000 0.050 0.213 0.194 0.080 0.089 0.107 0.132 0.100 0.149 0.133 0.134 0.112 0.133 0.134 0.072 0.143
Q005 0.085 0.170 0.017 0.001 0.007 0.003 0.022 0.037 0.049 0.050 1.000 0.063 0.091 0.056 -0.166 0.055 0.059 0.051 -0.182 0.071 0.071 0.060 0.071 0.071 0.018 0.113
Q006 0.277 0.528 0.149 0.145 0.154 0.144 0.177 0.112 0.238 0.213 0.063 1.000 0.322 0.102 0.043 0.147 0.166 0.026 0.082 0.114 0.115 0.096 0.114 0.115 0.109 0.120
Q025 0.126 0.423 0.149 0.135 0.130 0.142 0.150 0.121 0.194 0.194 0.091 0.322 1.000 0.214 0.036 0.137 0.079 0.013 0.083 0.062 0.061 0.063 0.062 0.061 0.039 0.046
SG_UF_PROVA 0.101 0.278 0.062 0.074 0.069 0.078 0.080 0.043 0.088 0.080 0.056 0.102 0.214 1.000 0.046 0.176 0.119 0.034 0.064 0.048 0.050 0.046 0.048 0.050 0.039 0.102
TP_ANO_CONCLUIU 0.122 0.121 -0.143 -0.107 -0.109 -0.122 -0.148 -0.178 0.078 0.089 -0.166 0.043 0.036 0.046 1.000 0.063 0.196 0.216 0.746 0.153 0.141 0.126 0.153 0.141 0.013 0.414
TP_COR_RACA 0.147 0.219 -0.168 -0.162 -0.151 -0.167 -0.163 -0.140 0.115 0.107 0.055 0.147 0.137 0.176 0.063 1.000 0.074 0.042 0.119 0.066 0.065 0.055 0.066 0.065 0.019 0.072
TP_DEPENDENCIA_ADM_ESC 0.197 0.172 0.151 0.123 0.131 0.117 0.143 0.134 0.137 0.132 0.059 0.166 0.079 0.119 0.196 0.074 1.000 0.058 0.199 0.100 0.105 0.086 0.100 0.105 0.075 0.441
TP_ESTADO_CIVIL 0.088 0.059 0.070 0.063 0.060 0.065 0.065 0.077 0.080 0.100 0.051 0.026 0.013 0.034 0.216 0.042 0.058 1.000 0.271 0.088 0.084 0.073 0.088 0.084 0.017 0.124
TP_FAIXA_ETARIA 0.192 0.198 -0.282 -0.228 -0.224 -0.249 -0.280 -0.317 0.126 0.149 -0.182 0.082 0.083 0.064 0.746 0.119 0.199 0.271 1.000 0.210 0.199 0.175 0.210 0.199 0.028 0.522
TP_PRESENCA_CH 0.590 0.114 0.702 0.705 0.635 0.706 0.635 0.658 0.123 0.133 0.071 0.114 0.062 0.048 0.153 0.066 0.100 0.088 0.210 1.000 0.642 0.709 1.000 0.642 0.010 0.167
TP_PRESENCA_CN 0.635 0.113 0.681 0.642 0.700 0.643 0.700 0.614 0.123 0.134 0.071 0.115 0.061 0.050 0.141 0.065 0.105 0.084 0.199 0.642 1.000 0.712 0.642 1.000 0.007 0.152
TP_PRESENCA_GERAL 0.641 0.116 0.750 0.579 0.572 0.580 0.572 0.542 0.103 0.112 0.060 0.096 0.063 0.046 0.126 0.055 0.086 0.073 0.175 0.709 0.712 1.000 0.709 0.712 0.012 0.142
TP_PRESENCA_LC 0.590 0.114 0.702 0.705 0.635 0.706 0.635 0.658 0.123 0.133 0.071 0.114 0.062 0.048 0.153 0.066 0.100 0.088 0.210 1.000 0.642 0.709 1.000 0.642 0.010 0.167
TP_PRESENCA_MT 0.635 0.113 0.681 0.642 0.700 0.643 0.700 0.614 0.123 0.134 0.071 0.115 0.061 0.050 0.141 0.065 0.105 0.084 0.199 0.642 1.000 0.712 0.642 1.000 0.007 0.152
TP_SEXO 0.036 0.099 0.050 0.067 0.119 0.044 0.142 0.085 0.069 0.072 0.018 0.109 0.039 0.039 0.013 0.019 0.075 0.017 0.028 0.010 0.007 0.012 0.010 0.007 1.000 0.047
TP_ST_CONCLUSAO 0.130 0.150 0.146 0.147 0.133 0.143 0.128 0.129 0.134 0.143 0.113 0.120 0.046 0.102 0.414 0.072 0.441 0.124 0.522 0.167 0.152 0.142 0.167 0.152 0.047 1.000
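
The High correlation alerts in this report are consistent with a cut-off of 0.5 on the absolute coefficient. For example, NU_INFRAESTRUTURA and Q006 are flagged at 0.528, while Q001 and Q002 at 0.341 are not. The sketch below applies that rule to a few pairs copied from the matrix. The threshold value and the helper name high_pairs are assumptions for illustration.

```python
# A few coefficients copied from the correlation matrix above.
excerpt = {
    ("NU_INFRAESTRUTURA", "Q006"): 0.528,
    ("TP_ANO_CONCLUIU", "TP_FAIXA_ETARIA"): 0.746,
    ("TP_FAIXA_ETARIA", "TP_ST_CONCLUSAO"): 0.522,
    ("Q001", "Q002"): 0.341,
    ("NU_MEDIA_GERAL", "TP_FAIXA_ETARIA"): -0.282,
}

THRESHOLD = 0.5  # assumed cut-off for a "high" correlation


def high_pairs(matrix, threshold=THRESHOLD):
    """Return the variable pairs whose absolute coefficient meets the threshold."""
    return sorted(pair for pair, r in matrix.items() if abs(r) >= threshold)


flagged = high_pairs(excerpt)
for a, b in flagged:
    print(f"{a} <-> {b}: {excerpt[(a, b)]:.3f}")
```

Taking the absolute value matters because a strong negative coefficient is as informative as a strong positive one.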

Missing values

[Figure: nullity count by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

index TP_FAIXA_ETARIA TP_SEXO TP_ESTADO_CIVIL TP_COR_RACA TP_DEPENDENCIA_ADM_ESC TP_ST_CONCLUSAO SG_UF_PROVA TP_PRESENCA_CN TP_PRESENCA_CH TP_PRESENCA_LC TP_PRESENCA_MT NU_NOTA_CN NU_NOTA_CH NU_NOTA_LC NU_NOTA_MT NU_NOTA_REDACAO TP_PRESENCA_GERAL TP_ANO_CONCLUIU NU_DESEMPENHO NU_INFRAESTRUTURA NU_MEDIA_GERAL Q001 Q002 Q005 Q006 Q025
0 3 F 1 1 2.0 2 RJ 1 1 1 1 409.00 527.00 418.00 416.50 720 1 0 2 3 498.0 C C 3 B B
1 2 F 1 1 -1.0 3 SP 1 1 1 1 499.50 535.50 549.00 570.50 560 1 0 2 1 543.0 F F 3 D B
2 8 M 1 2 -1.0 1 MA 1 1 1 1 425.25 391.75 446.00 503.50 460 1 0 2 2 445.2 D C 2 B B
3 12 M 1 1 -1.0 1 PE 1 1 1 1 621.00 584.50 493.00 412.75 820 1 16 2 2 586.0 B B 4 C B
4 2 F 1 1 2.0 2 PR 1 1 1 1 445.00 458.00 457.25 491.50 400 1 0 2 3 450.2 B B 4 B B
5 3 F 1 3 -1.0 2 BA 0 0 0 0 -1.00 -1.00 -1.00 -1.00 -1 0 0 3 3 -1.0 C E 2 C B
6 11 M 2 3 -1.0 1 SP 1 1 1 1 617.50 673.00 705.00 724.50 600 1 11 1 2 664.0 C C 4 E B
7 2 M 1 3 3.0 2 SP 1 1 1 1 407.25 583.00 479.50 601.50 660 1 0 2 1 546.0 D E 3 E B
8 12 F 0 2 -1.0 1 MS 1 1 1 1 518.00 630.00 611.00 538.00 580 1 7 2 1 575.5 H E 4 C B
9 11 F 1 3 -1.0 1 SP 0 0 0 0 -1.00 -1.00 -1.00 -1.00 -1 0 11 3 2 -1.0 B C 3 G B

Last rows

index TP_FAIXA_ETARIA TP_SEXO TP_ESTADO_CIVIL TP_COR_RACA TP_DEPENDENCIA_ADM_ESC TP_ST_CONCLUSAO SG_UF_PROVA TP_PRESENCA_CN TP_PRESENCA_CH TP_PRESENCA_LC TP_PRESENCA_MT NU_NOTA_CN NU_NOTA_CH NU_NOTA_LC NU_NOTA_MT NU_NOTA_REDACAO TP_PRESENCA_GERAL TP_ANO_CONCLUIU NU_DESEMPENHO NU_INFRAESTRUTURA NU_MEDIA_GERAL Q001 Q002 Q005 Q006 Q025
999990 1 M 1 3 -1.0 3 GO 1 1 1 1 549.00 563.00 589.50 726.50 840 1 0 1 1 653.5 F G 4 H B
999991 1 F 1 3 -1.0 3 PA 0 0 0 0 -1.00 -1.00 -1.00 -1.00 -1 0 0 3 3 -1.0 D E 3 A B
999992 2 M 1 3 -1.0 3 PE 1 1 1 1 491.50 512.50 584.50 643.50 560 1 0 2 2 558.5 C E 5 D B
999993 2 F 1 1 -1.0 3 BA 1 1 1 1 479.50 548.50 533.50 420.25 620 1 0 2 2 520.5 E E 5 C B
999994 2 F 1 0 -1.0 3 SP 0 0 0 0 -1.00 -1.00 -1.00 -1.00 -1 0 0 3 2 -1.0 D E 6 B B
999995 2 M 1 4 -1.0 3 PI 0 0 0 0 -1.00 -1.00 -1.00 -1.00 -1 0 0 3 2 -1.0 H E 4 B B
999996 11 M 3 3 -1.0 1 PA 1 1 1 1 497.25 334.25 480.75 441.75 540 1 0 2 3 458.8 D G 3 B B
999997 3 F 1 2 -1.0 1 PR 0 0 0 0 -1.00 -1.00 -1.00 -1.00 -1 0 1 3 1 -1.0 G C 4 H B
999998 4 M 1 3 -1.0 1 AP 1 1 1 1 439.00 390.25 454.25 423.75 0 1 1 3 1 341.5 E B 6 G B
999999 3 F 1 1 -1.0 1 SP 1 1 1 1 486.25 510.50 460.50 519.50 520 1 1 2 2 499.2 D E 2 E B

Duplicate rows

Most frequently occurring

index TP_FAIXA_ETARIA TP_SEXO TP_ESTADO_CIVIL TP_COR_RACA TP_DEPENDENCIA_ADM_ESC TP_ST_CONCLUSAO SG_UF_PROVA TP_PRESENCA_CN TP_PRESENCA_CH TP_PRESENCA_LC TP_PRESENCA_MT NU_NOTA_CN NU_NOTA_CH NU_NOTA_LC NU_NOTA_MT NU_NOTA_REDACAO TP_PRESENCA_GERAL TP_ANO_CONCLUIU NU_DESEMPENHO NU_INFRAESTRUTURA NU_MEDIA_GERAL Q001 Q002 Q005 Q006 Q025 # duplicates
4903 3 M 1 3 2.0 2 CE 0 0 0 0 -1.0 -1.0 -1.0 -1.0 -1 0 0 3 3 -1.0 H H 4 B B 30
3537 3 F 1 3 2.0 2 CE 0 0 0 0 -1.0 -1.0 -1.0 -1.0 -1 0 0 3 3 -1.0 H H 3 B B 24
4898 3 M 1 3 2.0 2 CE 0 0 0 0 -1.0 -1.0 -1.0 -1.0 -1 0 0 3 3 -1.0 H H 3 B B 22
2542 3 F 1 1 2.0 2 SP 0 0 0 0 -1.0 -1.0 -1.0 -1.0 -1 0 0 3 1 -1.0 E E 4 D B 20
3543 3 F 1 3 2.0 2 CE 0 0 0 0 -1.0 -1.0 -1.0 -1.0 -1 0 0 3 3 -1.0 H H 4 B B 20
4851 3 M 1 3 2.0 2 CE 0 0 0 0 -1.0 -1.0 -1.0 -1.0 -1 0 0 3 3 -1.0 B B 4 B B 20
1710 2 M 1 3 2.0 2 CE 0 0 0 0 -1.0 -1.0 -1.0 -1.0 -1 0 0 3 3 -1.0 H H 3 B B 19
4902 3 M 1 3 2.0 2 CE 0 0 0 0 -1.0 -1.0 -1.0 -1.0 -1 0 0 3 3 -1.0 H H 4 B A 18
4849 3 M 1 3 2.0 2 CE 0 0 0 0 -1.0 -1.0 -1.0 -1.0 -1 0 0 3 3 -1.0 B B 3 B B 16
4866 3 M 1 3 2.0 2 CE 0 0 0 0 -1.0 -1.0 -1.0 -1.0 -1 0 0 3 3 -1.0 C C 4 B A 16
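
A table like the one above can be produced by grouping identical rows and counting each group. The sketch below does this with the standard library on a handful of made-up rows. The rows and column choice are hypothetical, and how the report itself counts duplicates is not stated here. With pandas, DataFrame.duplicated offers a comparable starting point.

```python
from collections import Counter

# Hypothetical rows (tuples of column values), for illustration only.
rows = [
    (3, "M", "CE", -1.0),
    (3, "M", "CE", -1.0),
    (3, "F", "CE", -1.0),
    (3, "M", "CE", -1.0),
    (2, "F", "SP", 543.0),
    (3, "F", "CE", -1.0),
]

counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, n) for row, n in counts.most_common() if n > 1]

# Copies beyond the first occurrence of each duplicated row.
redundant = sum(n - 1 for _, n in duplicates)

for row, n in duplicates:
    print(row, n)
print("redundant copies:", redundant)
```

In this report the most frequent duplicates are rows where every score is -1, so they differ only in the demographic and questionnaire columns.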